Abstract

Background: Neutralization sensitivity of HIV-1 to antibodies and antisera varies greatly between isolates.
A significant role of the V1/V2 domain as a global regulator of neutralization sensitivity has been suggested. Recent X-ray structures
revealed the presence of well-defined tertiary structure within this domain but also demonstrated partial disorder and
conformational heterogeneity.

Methods: Correlations of neutralization sensitivity with the conformational propensities for beta-strand and alpha-helix
formation over the entire folded V1/V2 domain, as well as within a sliding 5-residue window, were investigated. The analysis was
based on a set of neutralization data for 106 HIV isolates for which consistent neutralization sensitivity measurements
against multiple pools of human immune sera have been previously reported.



Results: A significant correlation between the beta-sheet formation propensity of the folded segments of the V1/V2 domain and
neutralization sensitivity was observed. The strongest correlation peaks localized to beta-strands B and C. The correlation
persisted when subsets of HIV isolates belonging to clades B, C and circulating recombinant form BC were analyzed
individually or in combinations.

Conclusions: The observed correlations suggest that the stability of the beta-sheet structure and/or the degree of structural disorder in
the V1/V2 domain is an important determinant of the global neutralization sensitivity of HIV-1. While the specific
mechanism is yet to be investigated, a plausible hypothesis is that less ordered V1/V2s may have a stronger masking effect
on various neutralizing epitopes, perhaps effectively occupying a larger volume and thereby occluding antibody access.
Background

Neutralization by antibodies, along with cellular immunity, is a
key defense mechanism against viral infection. Most clinical
isolates of HIV-1 are notoriously difficult to neutralize by
antibodies. This resistance contributes both to the inability of the
human immune system to control HIV infection in the vast
majority of individuals and to the fact that, despite decades of
concerted efforts to create an effective prophylactic HIV vaccine,
only rather limited success has been reported so far (vaccine trial
RV144 in Thailand) [1]. Apart from the common viral resistance
mechanism of evasion via frequent mutations, HIV appears to
have evolved highly efficient ways of 'hiding' vulnerable conserved
immunogenic structures. The only viral proteins exposed on
HIV particles are the envelope glycoprotein ('env') gp120/gp41
trimeric spikes, which mediate host cell attachment and fusion [2].
The spikes contain conserved interfaces and other structures that
are necessary for receptor (CD4) [3] and co-receptor (CCR5 or
CXCR4) binding [4] and eventual fusion. However, the virus
appears to disguise these vulnerable targets from the host's
immune system under a heavy glycosylation layer [5], behind
highly variable elements [6], within narrow crevices of the
structure that are poorly accessible to antibodies, and using other
mechanisms of epitope 'masking' [7] that are still poorly
understood.

Yet this resistance varies greatly between different virus isolates, 
and a 'Tier' system has been proposed to classify HIV strains and 
to provide a virus panel for objective evaluation of immune sera 
and monoclonal antibodies in terms of their neutralization 
potency. Importantly, strains that resist neutralization often do 
so across multiple antibody types targeting different epitopes. 

In principle, variations in neutralization resistance should be
determined by the env sequence and ultimately by the structure and
dynamics of the spike. It has been proposed that the 'intrinsic
reactivity' of the env trimer, i.e. its propensity to undergo
conformational transition to lower-energy states from the initial
native state, provides an important contribution to global inhibition
sensitivity [8]. However, no general sequence-structure-function
(i.e. resistance) relationships have been established so far, although
individual mutations that dramatically alter resistance have been
reported [5,9,10].

Intriguingly, it was demonstrated that the V1/V2 region of gp120 is
an important determinant of the overall neutralization sensitivity
of HIV-1: modifications and deletions often increase
neutralization sensitivity [6,11], and swapping the V1/V2 sequence of a
neutralization-sensitive virus for a V1/V2 from a resistant one
conferred a neutralization-resistant phenotype, and vice versa
[12,13]. Binding experiments and mathematical modeling allowed
dissection of V1/V2 masking effects on the V3 loop [14]. Some
controversy exists as to whether V1/V2 and V3 interactions are
inter- or intra-protomer: a mathematical modeling approach
indicates interactions in trans (i.e. between neighboring subunits)
[14], while mixed trimer expression experiments suggest
that V3 masking occurs within each protomer (in cis) rather than
between protomers [15]. Possibly both mechanisms coexist [16].

Until recently, little was known about the structure of the V1/V2
domain, and the two segments in it delineated by disulfide
bridges were viewed as 'loops.' V1/V2 received limited attention
in vaccine development efforts because of its high variability and
apparently limited functional importance: V1/V2-deleted virus
often remains replication competent [17]. The region was
truncated out of all gp120 'core' structures solved by X-ray
crystallography to date. Interest in the region soared when
broadly neutralizing antibodies targeting V1/V2 were reported
[18], and soon thereafter the crystal structure of the V1/V2 domain was
solved in complex with the broadly neutralizing monoclonal antibody
(BNAb) PG9 [19]. In this structure, the V1/V2 domain was grafted onto
an unrelated scaffold protein. The X-ray structure established that
the V1/V2 domain is organized into a compact 4-strand anti-parallel
beta-sheet fold (Fig. 1). The antibody was observed to interact
primarily with strand 'C' and glycans (Fig. 2a). The subsequently
solved structure of the closely related BNAb PG16 complex
exhibits a very similar binding mode but more extensive glycan
interactions [20]. Next, two complexes of antibodies (CH58 and
CH59) with linear epitope peptide fragments of V1/V2 were
solved [21]. A low-resolution electron microscopy (EM) structure
provided the first unambiguous data on the localization of the V1/V2
domain within the trimeric gp120 spike [22]. Certain EM
structures suggest that the V1/V2 and V3 loops form a 'trimer
association domain' (TAD) near the apex of the gp120 spike
(Figure 2d) [23]. Most recently, the first medium-resolution trimeric
X-ray structure of the so-called SOSIP gp140 has been reported
[24], and the V1/V2 conformation in the context of this trimer
corresponded well to that observed in PG9/PG16 complexes.

Antibody responses to V1/V2 in the sera correlated with
protection in the RV144 vaccine trial in Thailand [25], the only
HIV vaccine trial that demonstrated (limited) efficacy [1].
Antibodies induced by the vaccine targeted the conserved mid-region
of the V2 loop [26]. Most recently, 'sieve' analysis demonstrated
that substitutions at two positions in V1/V2 correlate strongly with
susceptibility of the virus to vaccine protection in the RV144 trial [27].
An immunodominant site on V2 was revealed by epitope mapping of
conformational V2-specific human antibodies [28], and epitopes
of 8 mouse antibodies were localized to the C-strand termini [29].
Additionally, polyclonal antibodies in a broadly neutralizing blood
serum from an AIDS patient (coded CAP256) were shown to
target central residues of the BC hairpin [30].

Remarkably, the linear epitopes of the mAbs CH58 and CH59,
which localize to the stretch of the V1/V2 sequence corresponding
to the C-strand in mAb PG9 complexes, are structurally distinct:
CH58 binds a largely alpha-helical conformation of the antigen,
while in the CH59 complex it is a coil with elements of 3-10 helical
and extended conformations (Figs. 2b and 2c).

Can structural variability of the V1/V2 domain underlie its
influence on the neutralization sensitivity (NS) of HIV strains?
In the absence of structural data for a range of V1/V2
domains from different strains and in the native context, it may
still be possible to tease out certain structural correlates from
simple sequence-based biophysical descriptors. Since the possibility
of a secondary structure switch between beta-strand, coil and/or
alpha-helix is indicated by the X-ray structures, propensities to
form alpha-helix and beta-strand are of particular interest. While
all 20 naturally occurring amino acids are found within all types of
secondary structure, preferences towards one or another structure
that vary significantly between the amino acids are well established
[31,32], both via statistical analysis of protein structures [33]
and via energy calculations [34] and measurements [35].
Accordingly, a number of beta-strand (alpha-helix) propensity
scales have been proposed, attributing an energy cost/gain to
having a particular amino-acid type within a strand (or helix),
usually relative to alanine [36,37]. The total propensity calculated
over a stretch of polypeptide sequence can provide a measure of its
preference for a particular secondary structure state. Herein, an
investigation into correlations of such propensities for the V1/V2 loop
domain of gp120, and certain segments within it, with virus
sensitivity to antibody neutralization is presented.
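
As a minimal sketch of the 'total propensity' idea described above, the snippet below sums per-residue values over a stretch of sequence. The function name total_propensity and the numbers in TOY_SCALE are hypothetical placeholders chosen only for illustration; they are not taken from any published propensity scale, and a real analysis would substitute one of the scales cited above [36,37].

```python
# Hypothetical per-residue beta-strand propensities (arbitrary units,
# alanine set to 0). These values are illustrative placeholders, NOT a
# published scale.
TOY_SCALE = {"A": 0.0, "V": -0.5, "I": -0.6, "T": -0.4,
             "S": 0.2, "G": 0.8, "P": 1.5}


def total_propensity(sequence, scale, gap_chars="-"):
    """Sum per-residue propensities over a stretch of sequence.

    Alignment gap characters are skipped; a residue missing from the
    scale raises KeyError so that it is not silently ignored.
    """
    return sum(scale[res] for res in sequence if res not in gap_chars)


if __name__ == "__main__":
    # Under the sign convention assumed here, a more negative total
    # means a stronger preference for beta-strand.
    for segment in ("VITVA", "GPSGA", "VI-TV"):
        print(segment, round(total_propensity(segment, TOY_SCALE), 2))
```

Passing the scale as an argument keeps the same function usable for a helix scale, which is how the alpha-helix comparison could be run without changing the code.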

Results and Discussion 

Overall beta-sheet propensity of V1/V2 domain correlates 
with neutralization sensitivity 

Correlation of the neutralization sensitivity for 106 HIV isolates
on the tiered neutralization assessment panel [38] with structural
propensities of the V1/V2 domain was analyzed. The total beta-sheet
propensity (BSP) across 60 well-aligned amino-acid positions
(Fig. 3) within the V1/V2 domain and its stem correlated with the
log10 of neutralization ID50 (50% inhibitory dose) by HIVIG
(HIV immune globulins). The Pearson correlation coefficient R was
0.34. While modest, the correlation was highly statistically significant,
with a p-value estimate of 0.0003. For comparison, when the
same calculation was performed for alpha-helical propensity,
R = -0.14 with a p-value estimate of 0.15 was obtained.
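
The correlation coefficient quoted above can be illustrated with a short, dependency-free sketch. The function pearson_r is a generic textbook implementation, not the author's code, and the two input lists are made-up numbers standing in for per-isolate total BSP and log10 ID50 values.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    # A constant sample has zero variance and makes R undefined;
    # in that case this line raises ZeroDivisionError.
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical values for five isolates (not data from the panel).
    total_bsp = [-3.1, -2.4, -1.9, -1.2, -0.8]
    log10_id50 = [1.2, 1.9, 1.6, 2.3, 2.1]
    print(round(pearson_r(total_bsp, log10_id50), 3))
```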

Localization of BSP/neutralization sensitivity correlation
'hotspots' within V1/V2 domain

To investigate whether there were particular regions within V1/V2
where the correlation was localized, total propensities within a
sliding window of 5 amino acids around each position were
calculated and correlations with log10 ID50 evaluated. Short
5-residue segments rather than individual residues were analyzed
because secondary structure formation is a cooperative process
and therefore total propensities within a window are expected to
be more structurally meaningful and less noisy. For 15 positions,
p-values of the correlations reached statistical significance (p<0.05).
Importantly, all of them had the same (positive) sign of the
correlation, i.e. lower ID50s correlated with more negative
propensity (i.e. more stable beta-strand structures). When plotted
along the amino-acid sequence, two peaks of log ID50/beta
propensity correlation centered at positions T163 and Y177 were
observed (p-values of 0.007 and 0.01), as well as a weaker peak at
S197 (p = 0.02) (Figure 4). Notably, the last segment extends
beyond the end of V2 proper into the V1/V2 stem. For comparison,
the same analysis was performed with alpha-helix propensities.
Weaker correlations with oscillating sign were observed and only
Figure 1. Schematic secondary and tertiary structure organization of the V1/V2 domain in HIV gp120. Disulfide bridges are shown in
yellow. Beta-strands (shown as block arrows) are assigned according to the X-ray structure (PDB ID 3U4E). In this structure the stem region is replaced by
an unrelated scaffold, but the stem is independently observed to form anti-parallel strands in multiple gp120 X-ray structures (e.g. PDB ID 2B4C).
The dashed line indicates a polypeptide chain segment unresolved in the available X-ray structures, although the N-terminal part of it has a well-conserved sequence,
including the integrin binding motif.
doi:10.1371/journal.pone.0094002.g001
Figure 2. Available X-ray structures of V1/V2 domain and its fragments. (a) Complex of the V1/V2 domain (blue ribbon, glycans shown in
ball-and-stick representation; parts interacting with the Fab are in magenta) grafted on a scaffold protein (not shown) and the BNAb PG9 (grey
ribbon). (b) and (c) V1/V2 fragments (magenta) in complex with mAb CH58 and CH59, respectively (grey ribbon). Structures were visualized from
Protein Data Bank entries 3U4E, 4HP0 and 4HPY. (d) Visualization of the EM electron density map (EMDataBank, www.emdatabank.org, accession
no. EMD-5447) of the trimeric env spike. Approximate localization of the V1/V2/V3 loops (forming the presumed 'TAD') is delineated in red.
doi:10.1371/journal.pone.0094002.g002
[Figure 3 image: multiple sequence alignment of the V1/V2 domains of the 106 HIV-1 gp120 sequences, labeled by sequence identifier (e.g. A1EAI4_9HIV1); the alignment columns did not survive text extraction.]

Figure 3. Multiple alignment of 106 sequences of V1/V2 domains in HIV gp120. The sequence profile is shown above the alignment: the stack of
amino-acid residue letter codes above each position indicates frequencies (taller letters are more frequent), with the most frequent one at the bottom of
the stack. The secondary structure cartoon is shown below the alignment as assigned in the X-ray structure (PDB 3U4E). Also indicated at the bottom are
the three 5-residue segments (red intervals) that represent BSP/NS 'hotspots'.
doi:10.1371/journal.pone.0094002.g003



4 segments (centered at L130, C131, P183 and A200) reached
p-values somewhat below the 0.05 cutoff (see Figure S1).
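
The sliding-window scan described in this subsection can be sketched as follows. This is an illustrative outline under stated assumptions rather than the original analysis code: each row is assumed to hold one isolate's per-residue propensity values at the same aligned positions (equal row lengths, poorly aligned columns already removed), and the names window_totals and scan_alignment are hypothetical.

```python
def window_totals(values, window=5):
    """Map each window centre (0-based index) to the sum over a full window.

    Positions too close to either end to host a complete window are omitted.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    return {
        centre: sum(values[centre - half:centre + half + 1])
        for centre in range(half, len(values) - half)
    }


def scan_alignment(rows, window=5):
    """For each window centre, collect the window totals of all isolates.

    Assumes every row has the same length (aligned positions).
    """
    per_isolate = [window_totals(row, window) for row in rows]
    centres = per_isolate[0].keys() if per_isolate else []
    return {c: [totals[c] for totals in per_isolate] for c in centres}


if __name__ == "__main__":
    # Two hypothetical isolates with six aligned positions each.
    rows = [[-0.5, -0.6, 0.2, -0.4, 0.8, 0.0],
            [0.8, 0.2, 0.0, -0.5, -0.6, -0.4]]
    for centre, totals in scan_alignment(rows).items():
        print(centre, [round(t, 2) for t in totals])
```

Each list returned by scan_alignment holds one value per isolate, so it can be correlated with the corresponding log10 ID50 values to obtain one R and one p-value per window centre.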

Remarkably, the two pronounced peaks of beta-strand propensity
correlation corresponded well to the C-terminal parts of
adjacent beta-strands (B and C) observed in the available X-ray
structures (Protein Data Bank IDs 3U4E and 3U2S). It is therefore
plausible that propensities from both segments combine to
determine the overall stability of the beta-hairpin structure formed
by the two strands, perhaps regulating the length of the stably
folded part versus less ordered connecting loops. Indeed, total BSP
from the central 32-residue V1/V2 segment (E153-I184) showed a
Pearson correlation coefficient with neutralization log10 ID50 of
Figure 4. Plots of Pearson correlation coefficients R and p-values for the BSP/NS correlation (top and middle, respectively). BSPs are
calculated for 5 amino-acid segments centered on each position within the three conserved stretches of V1/V2 and its stem. Gaps in the plot
correspond to the two hyper-variable regions that aligned poorly and were excluded from the analysis (see also Fig. 3). Also shown are the secondary
structure and three 'hotspot' segments (see legend of Fig. 3).
doi:10.1371/journal.pone.0094002.g004
0.35, similar to that of the total BSP of the entire conserved V1/V2
domain above, and a p-value of 0.00002, actually better than
that for the entire domain. A third, weaker peak was in the middle of
strand D, extending into the V1/V2 stem.

A stronger correlation could be achieved if contributions from the
top three independent 5-residue segments were combined (I161-I165,
F175-L179 and S195-S199), with R = 0.47 (confidence
interval 0.35-0.57) and p<0.000001 (the p-value for the correlation
here is not strictly meaningful since the choice of contributing
segments is done a posteriori). The correlation plot of log10 ID50 versus
ΔG_BSP(I161-I165, F175-L179, S195-S199) is shown in Fig. 5. On the other
hand, no detectable signal is seen in the N-terminal strand A. We can
hypothesize that this strand is more stable, possibly due to two
disulfide bridges, and does not experience significant structural
variability.
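
Combining the three hotspot segments amounts to summing per-residue propensities over the listed residue ranges. The sketch below shows that bookkeeping only; the residue ranges are those named in the text, whereas the function name combined_propensity and the per-residue values are hypothetical.

```python
# Inclusive residue ranges of the three 5-residue hotspot segments
# named in the text.
HOTSPOT_SEGMENTS = [(161, 165), (175, 179), (195, 199)]


def combined_propensity(per_residue, segments):
    """Sum propensities over all residues in the given inclusive ranges.

    per_residue maps a residue number to the propensity value of the
    amino acid found at that position in one isolate.
    """
    return sum(
        per_residue[pos]
        for start, end in segments
        for pos in range(start, end + 1)
    )


if __name__ == "__main__":
    # Hypothetical isolate: every position is given the same toy value.
    toy_isolate = {pos: -0.2 for pos in range(150, 210)}
    print(round(combined_propensity(toy_isolate, HOTSPOT_SEGMENTS), 2))
```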

Persistence of the correlation within different clades of 
HIV 

Robustness of the correlation was investigated by partitioning
the data into subsets according to virus clades: the panel of 106
viruses contained 40 clade B, 24 clade C, 11 in CRF (circulating
recombinant form) BC and 11 in CRF AG. Other clades and
forms had only a few representatives each and were combined in a
separate group of 20 viruses, mostly related to clade A. Significant
correlation was still present in the first two groups, with the strongest
correlation in clade B, and a somewhat weaker one in clade C (Table 1).
It should be noted that to reach statistical significance
(p-value<0.05) at the expected R~0.4 the dataset generally needs to
have >18 points [39]. Thus CRF BC also had a comparable R
value, but statistical significance could not be reached, likely due to
the smaller size of the group. Correlation values were in the same
range for combinations of these three groups, indicating that the
underlying relationship holds within and across the two clades and
their CRF. Thus the observed correlation did not arise from possible
trivial co-variations with sequence and neutralization sensitivity
differences between clades. On the other hand, CRF AG and the
group that combined other clades exhibited only weak,
statistically non-significant correlation. It is possible that
neutralization sensitivity differences unrelated to the V1/V2 structural
effects obscure the correlation in this genetically diverse group.
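
The sample-size remark above can be checked with the standard relation between a correlation coefficient and the t-statistic, t = R * sqrt((n - 2) / (1 - R^2)). The sketch below assumes a one-tailed test at the 0.05 level; that choice is an assumption made here for illustration and is not stated in the text. The critical values are taken from a standard Student t table.

```python
import math


def t_statistic(r, n):
    """t-statistic for a Pearson correlation r observed on n points."""
    if n < 3 or not -1.0 < r < 1.0:
        raise ValueError("need n >= 3 and -1 < r < 1")
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# One-tailed critical t values at alpha = 0.05, keyed by degrees of
# freedom (n - 2); standard table values. The one-tailed choice is an
# assumption of this sketch.
T_CRIT_ONE_TAILED_05 = {9: 1.833, 16: 1.746, 17: 1.740}


if __name__ == "__main__":
    for n in (11, 18, 19):
        t = t_statistic(0.4, n)
        significant = t > T_CRIT_ONE_TAILED_05[n - 2]
        print(n, round(t, 3), significant)
```

Under these assumptions, R = 0.4 falls short of the threshold for 11 points (the size of the CRF BC and CRF AG groups) and for 18 points, and crosses it at 19 points.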

Conclusions 

Conformational heterogeneity of the V1/V2 region was recently
demonstrated by the X-ray investigations of PG9/PG16, CH58

Table 1. BSP/neutralization sensitivity correlations within
clade subgroups of the HIV virus panel.

Clade                              Number of viruses   R (p-value)
All                                106                 0.47 (<0.000001)
B                                  40                  0.48 (0.002)
C                                  24                  0.4 (0.05)
CRF BC                             11                  0.4 (0.2)
B + C                              64                  0.5 (0.000005)
B + C + CRF BC                     75                  0.51 (<0.000001)
CRF AG                             11                  0.25 (0.4)
All other (mostly A and related)   20                  0.2 (0.4)

doi:10.1371/journal.pone.0094002.t001



and CH59 mAb complexes with their epitopes: the stretch corresponding to the
C-strand of the PG9 complex, which encompasses the key residues involved in
binding to all three mAbs, exhibits beta-strand secondary structure
in the first case, mostly alpha-helical secondary structure in the
second case, and a coiled conformation in the third. While this
ability to adopt different conformations has so far been directly
demonstrated only in antibody binding, it may also have a
functional role, i.e. the conformation may change during the
transitions associated with attachment and fusion, and the prevalence
of different conformations may also vary from virus strain to virus
strain or from clade to clade. Herein it is demonstrated that a
simple sequence-based measure of the propensity to form
beta-structure within the B/C hairpin region of the V1/V2 domain
significantly correlates with neutralization sensitivity of the virus.

Modulation of this propensity likely either triggers a switch from
one conformation to another or affects the equilibrium between
multiple conformations that the domain can adopt. Another
possibility is that beta-sheet propensity controls an order/disorder
transition or equilibrium within the V1/V2 domain. The functional
importance of intrinsic disorder in proteins [40], and more
specifically in viral proteins [41,42], is increasingly gaining
recognition. A range of mechanisms by which conformational changes
in the V1/V2 domain affect neutralization sensitivity can be
proposed. Recent low-resolution structural studies of the gp120
trimers indicate that V1/V2 domains localize near the axis of the
spike, and therefore the three domains likely contact and interact
with each other. These interactions may influence the overall
configuration of the trimer, making it more 'open' or 'closed'
and modulating accessibility of multiple unrelated epitopes. On
the other hand, parts of loops within the V1/V2 domain and their
glycan decorations may extend over other immunogenic regions of
gp120 and shield them from antibody access. The V3 region in
particular has been shown to be subject to 'masking' effects that
are largely V1/V2 mediated [14-16,43]. We can hypothesize that a
less-ordered V1/V2 domain may be effectively bulkier and block
access to various other epitopes more strongly than when it is tightly
folded. Finally, V1/V2 in itself is an important neutralizing Ab
target, and some of the observed changes in neutralization
sensitivity to immune serum may simply reflect differences in
presentation of the intrinsic V1/V2 neutralizing epitopes.

Clearly, the described effect of V1/V2 beta-strand propensity is
not the only factor that determines the neutralization resistance of
HIV virus, as the wide spread of the correlation plot (Fig. 5)
indicates. It has been reported that a single-position mutation,
D179N, could convert a highly neutralization-resistant virus into a
sensitive one [9]. Presence or absence of glycans at certain
positions was also shown to play a role [5]. Mutations outside
V1/V2 have been demonstrated to affect global sensitivity as well [10].
Nevertheless, all previously reported mutation data were of a
singular nature, without a clear common trend. The generality of the
observed effect of secondary structure propensity points towards a
common conformational mechanism of neutralization sensitivity
modulation applicable to many HIV virus strains. To establish how
local secondary structure preferences affect global neutralization
sensitivity, further structural investigations of the V1/V2 region
in relation to different strains of HIV will be needed. The finding
may have implications for HIV vaccine design, which increasingly
incorporates structural considerations.

Methods 

Sequences and alignment 

gp120 sequences of 106 HIV isolates on the tiered neutralization
assessment panel [38] were extracted from GenBank.

Figure 5. Correlation plot of HIVIG ID50 versus the total BSP calculated across the three 'correlation hotspot' segments (15
positions, amino acids 161:165, 175:179 and 195:199).

doi:10.1371/journal.pone.0094002.g005



The panel contains 109 viruses; however, the three 'Tier 1A'
viruses (MW965.26, SF162.LS and MN/H9) were excluded as
clear outliers with neutralization titers orders of magnitude outside
the general range. Sequences were aligned and the V1/V2 region
sub-alignment extracted in ICM [44,45] (Fig. 3). Two segments of
the V1/V2 domain, amino acids (AA) T132-G152 and D185-S190
(numbering and AA residues here and throughout the paper follow
the env sequence of HIV-1 strain HXB2), are extremely variable
both in composition and length. They were excluded from the
analysis. Also excluded were two short inserts present only in one
sequence each (4 AAs between positions 169/170 in H078.14 and
1 AA between positions 165/166 in 9021.14.B2.4571). Because
X-ray structures of the V1/V2 domain show that the C- and N-terminal
secondary structure elements (beta strands A and D) extend
through the C126-C196 disulfide bridge that formally separates
the V1/V2 domain itself and the so-called 'stem' (independently
observed forming a beta-strand in other gp120 X-ray structures),
sequence segments belonging to the stem were also included in the
analysis (up to the next disulfide, C119-C205). As seen in the
results, only a minor signal was observed outside the V1/V2 domain
proper.

Neutralization data 

As a measure of neutralization sensitivity (NS) of the virus,
neutralization data from ref. [38] were used, expressed as log10
ID50 titers of HIVIG (HIV immune globulin) in µg/ml; thus
higher titer values correspond to lower sensitivity. HIVIG is a
purified Ig reagent obtained from the NIH AIDS
Research and Reference Reagent Program. It is prepared
from pooled plasma of asymptomatic, HIV antibody positive
donors. Titer values varied from 28 to >2500 µg/ml (titers
exceeding 2500 were not measured precisely; only the censored
value is available). The three excluded 'Tier 1A' strains (see above)
had titers of <0.02, 6 and 7 µg/ml. The logarithm of the HIVIG
concentration, rather than the concentration itself, was used to
calculate correlations with structural propensities, because the
logarithm of concentration should relate linearly to the binding free
energy. Structural propensities reflect the free energy involved in
secondary structure formation; thus physically homogeneous
parameters of the system were being correlated.
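
The reasoning behind the log transform can be written out explicitly. The following is only a sketch under an assumption not stated in the text, namely that the measured ID50 concentration is roughly proportional to an effective dissociation constant K_d of the antibody-virus interaction (c° denotes the standard concentration, R the gas constant and T the temperature):

```latex
\Delta G^{\circ}_{\mathrm{bind}} = RT \ln\frac{K_d}{c^{\circ}}
\quad\Longrightarrow\quad
\log_{10} \mathrm{ID}_{50} \approx \frac{\Delta G^{\circ}_{\mathrm{bind}}}{2.303\,RT} + \mathrm{const}
```

Under that assumption log10 ID50 is linear in a free energy, which places it on the same footing as a propensity scale expressed in kcal/mol.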

Structural propensities 

The Smith, Withka & Regan [36] scale of beta-sheet propensities
was used. The scale is based on protein stability changes upon
Ala/Xxx substitutions, experimentally measured via melting
temperature. This scale was chosen because the ΔΔG (kcal/mol)
values are expected to directly reflect changes in stability of
beta-sheets associated with a substitution. The Pace & Scholtz [37]
scale was used for alpha-helix propensity.

Initially, the total beta-sheet propensity over the conserved elements
of the V1/V2 domain was evaluated for each env sequence:

$$p_{\mathrm{tot}}(j) = \sum_{i} p(S_j(i))$$

where S_j(i) is the amino acid at the i-th position in the j-th sequence
according to the alignment, p() is the appropriate propensity
according to the scale, and summation is over all well-aligned
positions. Highly variable segments with insertions/deletions in
many sequences were excluded from the analysis, leaving 60
well-aligned positions within the N-terminal (in HXB2 numbering AA
C119-C131), central (E153-I184), and C-terminal (Y191-C205)
conserved elements.
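
The summation can be illustrated with a short Python sketch. This is not the authors' code: the function name, the toy alignment and the column indices are hypothetical, and the values in `TOY_SCALE` are placeholders chosen for easy arithmetic, not the published propensities from ref. [36].

```python
# Placeholder propensity values for illustration only; these are NOT the
# published Smith, Withka & Regan values (assumption for this sketch).
TOY_SCALE = {"A": 0.0, "V": 1.0, "I": 1.0, "T": 1.0, "G": -1.0, "P": -2.0}


def total_propensity(aligned_seq, positions, scale):
    """Sum the scale values of the residues found at the given
    alignment columns (0-based) of one aligned sequence."""
    return sum(scale[aligned_seq[i]] for i in positions)


# Hypothetical toy alignment: one string per sequence, equal lengths.
alignment = ["AVITG", "AVGTP", "GPATV"]
# Hypothetical set of well-aligned columns; the last column is excluded,
# mimicking the exclusion of highly variable segments.
well_aligned = [0, 1, 2, 3]

totals = [total_propensity(seq, well_aligned, TOY_SCALE) for seq in alignment]
print(totals)
```

Each entry of `totals` corresponds to one sequence j, so the list as a whole is the vector that would be correlated with the per-virus log10 ID50 values.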

Total beta-sheet or alpha-helix propensities for 5-residue
segments within a window sliding along the V1/V2 domain sequence
were calculated:

$$p(S_j(i-2)) + p(S_j(i-1)) + p(S_j(i)) + p(S_j(i+1)) + p(S_j(i+2))$$
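
A minimal Python sketch of the 5-residue sliding-window sum follows. The function name and the test sequence are hypothetical, `TOY_SCALE` again holds placeholder values rather than a published scale, and the sketch assumes a gap-free stretch of aligned residues; how windows at the segment edges were handled is not specified in the text, so they are simply skipped here.

```python
# Placeholder propensity values for illustration only (not a published scale).
TOY_SCALE = {"A": 0.0, "V": 1.0, "I": 1.0, "T": 1.0, "G": -1.0, "P": -2.0}


def window_propensities(aligned_seq, scale, half_width=2):
    """Map each window center i (0-based) to the summed propensity of
    residues i-half_width .. i+half_width. Centers too close to either
    end to fit a full window are skipped (an assumption of this sketch)."""
    values = [scale[aa] for aa in aligned_seq]
    n = len(values)
    return {
        i: sum(values[i - half_width : i + half_width + 1])
        for i in range(half_width, n - half_width)
    }


# Hypothetical 7-residue segment; full 5-residue windows exist for centers 2..4.
profile = window_propensities("AVITGPA", TOY_SCALE)
print(profile)
```

Running the function over every aligned sequence and collecting the values for a fixed center i gives one propensity vector per alignment position.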

Thus, vectors of propensities across 106 sequences were
generated for each position within the alignment of V1/V2
domains. The Pearson correlation coefficient of log10 ID50 with the
total propensity and with the 5-residue segment propensities for each
position was calculated. A randomization test [46] was used to estimate
p-values of the correlations (1,000,000 samples). To control errors